orthogonal basis
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Health & Medicine (0.67)
- Energy (0.45)
Notation
We use $[n]$ to denote $\{1, \dots, n\}$ for an integer $n$. To control the norm in the last line of (A.7), we note that ... The proof is along the lines of Theorem 1. The following lemma is useful for our proof. Applying the Hermite decomposition (A.27) and taking expectation, we have ... Each summand in (A.28) is an $n \times n$ matrix where ... We use the properties of Hermite polynomials [3, 18.18.11]: the Hermite polynomials form an orthogonal basis, equipped with the inner product ... This basis follows the physicist's convention for Hermite polynomials.
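As a small numerical illustration of the orthogonality property mentioned in this excerpt, the sketch below checks that physicist's Hermite polynomials are orthogonal under the weight $e^{-x^2}$, using Gauss-Hermite quadrature from NumPy. The helper name `hermite_gram` and the choice of 40 quadrature nodes are assumptions made for this example; the normalization $\sqrt{\pi}\, 2^k k!$ is the standard textbook value and is not taken from the excerpt.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval


def hermite_gram(max_degree, quad_points=40):
    """Gram matrix of H_0..H_max_degree under the weight exp(-x^2).

    hermite_gram is a hypothetical helper written for this illustration.
    """
    # Nodes and weights for integrals of the form  int f(x) exp(-x^2) dx.
    nodes, weights = hermgauss(quad_points)
    # Row k holds H_k evaluated at every node (unit coefficient vector selects H_k).
    identity = np.eye(max_degree + 1)
    values = np.stack([hermval(nodes, identity[k]) for k in range(max_degree + 1)])
    # Entry (m, n) approximates  int H_m(x) H_n(x) exp(-x^2) dx.
    return (values * weights) @ values.T


gram = hermite_gram(4)
# Standard squared norms in the physicist's convention: sqrt(pi) * 2^k * k!
expected = np.diag([math.sqrt(math.pi) * 2**k * math.factorial(k) for k in range(5)])
print(np.allclose(gram, expected, atol=1e-6))
```

The off-diagonal entries vanish up to floating-point error, which is what "orthogonal basis" means here; the diagonal entries are not 1, so the basis is orthogonal but not orthonormal.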
PLAN: Proactive Low-Rank Allocation for Continual Learning
Wang, Xiequn, Zhuang, Zhan, Zhang, Yu
Continual learning (CL) requires models to continuously adapt to new tasks without forgetting past knowledge. In this work, we propose Proactive Low-rank AllocatioN (PLAN), a framework that extends Low-Rank Adaptation (LoRA) to enable efficient and interference-aware fine-tuning of large pre-trained models in CL settings. PLAN proactively manages the allocation of task-specific subspaces by introducing orthogonal basis vectors for each task and optimizing them through a perturbation-based strategy that minimizes conflicts with previously learned parameters. Furthermore, PLAN incorporates a novel selection mechanism that identifies and assigns basis vectors with minimal sensitivity to interference, reducing the risk of degrading past knowledge while maintaining efficient adaptation to new tasks. Empirical results on standard CL benchmarks demonstrate that PLAN consistently outperforms existing methods, establishing a new state-of-the-art for continual learning with foundation models.
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > France (0.04)
- Information Technology > Artificial Intelligence > Natural Language (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
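The PLAN abstract above describes giving each task its own orthogonal basis vectors so that new updates do not interfere with earlier ones. The sketch below illustrates only that general idea: drawing a new set of orthonormal directions that lie outside the span used by previous tasks. It is not the PLAN algorithm, whose perturbation-based optimization and selection mechanism are not reproduced here. The function `allocate_orthogonal_basis`, the dimensions, and the random initialization are all assumptions for illustration.

```python
import numpy as np


def allocate_orthogonal_basis(previous, rank, rng):
    """Return `rank` orthonormal columns orthogonal to the columns of `previous`.

    Hypothetical helper: `previous` is assumed to have orthonormal columns
    spanning the subspace already used by earlier tasks.
    """
    dim = previous.shape[0]
    candidates = rng.standard_normal((dim, rank))
    # Project out every component that lies in the previously used subspace.
    candidates -= previous @ (previous.T @ candidates)
    # Orthonormalize what remains; QR preserves the column span.
    new_basis, _ = np.linalg.qr(candidates)
    return new_basis


rng = np.random.default_rng(0)
# Assumed toy setting: 16-dimensional weights, rank-4 subspace per task.
previous, _ = np.linalg.qr(rng.standard_normal((16, 4)))
new_basis = allocate_orthogonal_basis(previous, rank=4, rng=rng)

print(np.allclose(new_basis.T @ new_basis, np.eye(4)))    # orthonormal within the task
print(np.allclose(previous.T @ new_basis, 0, atol=1e-10))  # no overlap with earlier tasks
```

Because the new directions have zero inner product with the old ones, an update confined to the new subspace leaves the components along the earlier subspace unchanged, which is the interference-avoidance intuition in the abstract.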
Beyond independent component analysis: identifiability and algorithms
Ribot, Alvaro, Seigal, Anna, Zwiernik, Piotr
Independent Component Analysis (ICA) is a classical method for recovering latent variables with useful identifiability properties. For independent variables, cumulant tensors are diagonal; relaxing independence yields tensors whose zero structure generalizes diagonality. These models have been the subject of recent work in non-independent component analysis. We show that pairwise mean independence answers the question of how much one can relax independence: it is identifiable, any weaker notion is non-identifiable, and it contains the models previously studied as special cases. Our results apply to distributions with the required zero pattern at any cumulant tensor. We propose an algebraic recovery algorithm based on least-squares optimization over the orthogonal group. Simulations highlight robustness: enforcing full independence can harm estimation, while pairwise mean independence enables more stable recovery. These findings extend the classical ICA framework and provide a rigorous basis for blind source separation beyond independence.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > New York (0.04)
Clifford Group Equivariant Neural Networks
Ruhe, David, Brandstetter, Johannes, Forré, Patrick
We introduce Clifford Group Equivariant Neural Networks: a novel approach for constructing $\mathrm{O}(n)$- and $\mathrm{E}(n)$-equivariant models. We identify and study the $\textit{Clifford group}$, a subgroup inside the Clifford algebra tailored to achieve several favorable properties. Primarily, the group's action forms an orthogonal automorphism that extends beyond the typical vector space to the entire Clifford algebra while respecting the multivector grading. This leads to several non-equivalent subrepresentations corresponding to the multivector decomposition. Furthermore, we prove that the action respects not just the vector space structure of the Clifford algebra but also its multiplicative structure, i.e., the geometric product. These findings imply that every polynomial in multivectors, including their grade projections, constitutes an equivariant map with respect to the Clifford group, allowing us to parameterize equivariant neural network layers. An advantage worth mentioning is that we obtain expressive layers that can elegantly generalize to inner-product spaces of any dimension. We demonstrate, notably from a single core implementation, state-of-the-art performance on several distinct tasks, including a three-dimensional $n$-body experiment, a four-dimensional Lorentz-equivariant high-energy physics experiment, and a five-dimensional convex hull experiment.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Health & Medicine (0.67)
- Energy (0.45)
Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis
Chen, Xiang, Song, Zhao, Sun, Baocheng, Yin, Junze, Zhuo, Danyang
Many machine learning algorithms require large amounts of labeled data to deliver state-of-the-art results. In applications such as medical diagnosis and fraud detection, unlabeled data is abundant, but labeling it through experts, experiments, or simulations is costly. Active learning algorithms aim to reduce the number of required labeled data points while preserving performance. For many convex optimization problems such as linear regression and $p$-norm regression, there are theoretical bounds on the number of labels required to achieve a certain accuracy. We call this the query complexity of active learning. However, today's active learning algorithms require the underlying learned function to have an orthogonal basis. For example, when applying active learning to linear regression, the requirement is that the target function is a linear combination of a set of orthogonal linear functions, and active learning can find the coefficients of these linear functions. We present a theoretical result showing that active learning does not need an orthogonal basis but only a nearly orthogonal basis. We provide the corresponding theoretical proofs for function families with a nearly orthogonal basis, along with applications in an algorithmically efficient active learning framework.
- North America > United States > California (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.61)